Enhanced graph based approach for multi document summarization
Authors
Abstract
Summarizing documents to meet the needs of a user is challenging. Among the many available approaches, graph-based methods have been widely investigated for summarizing document content. This paper focuses on two graph-based methods proposed by Erkan and Radev, namely LexRank (threshold) and LexRank (continuous), and proposes two enhancements that add features to the existing methods. First, a discounting approach is introduced to form a summary with less redundancy among sentences. Second, a position weight mechanism is adopted to preserve the importance of sentences based on the position they occupy. Intrinsic evaluation was carried out on two data sets. Data set 1 was created manually from newspaper documents that we collected for the experiments. Data set 2 is the DUC 2002 data, which is commercially available and distributed or accessed through the National Institute of Standards and Technology (NIST). We show that the results, measured by precision and recall, were comprehensively better than those of the earlier algorithms.
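The abstract does not give the formulas behind the two enhancements, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes plain term-frequency cosine similarity (the original LexRank uses an idf-modified cosine), a PageRank-style damping factor, a hypothetical linear position weight, and a hypothetical multiplicative redundancy discount. The names `lexrank`, `summarize`, `discount`, and `position_bonus` are illustrative and do not come from the paper.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def lexrank(sentences, threshold=None, damping=0.85, iters=50):
    """Score sentences by power iteration on a sentence similarity graph.

    threshold=None keeps the weighted edges (continuous variant);
    a numeric threshold binarizes the edges (threshold variant).
    """
    bags = [Counter(s.lower().split()) for s in sentences]
    n = len(bags)
    sim = [[cosine(bags[i], bags[j]) for j in range(n)] for i in range(n)]
    if threshold is not None:
        sim = [[1.0 if v > threshold else 0.0 for v in row] for row in sim]
    # Row-normalize so that each row is a probability distribution.
    rows = []
    for row in sim:
        total = sum(row)
        rows.append([v / total for v in row] if total else [1.0 / n] * n)
    scores = [1.0 / n] * n
    for _ in range(iters):
        scores = [
            (1 - damping) / n
            + damping * sum(scores[j] * rows[j][i] for j in range(n))
            for i in range(n)
        ]
    return scores, bags


def summarize(sentences, k=2, discount=0.5, position_bonus=0.2, threshold=None):
    """Greedy selection with an assumed position weight and redundancy discount."""
    scores, bags = lexrank(sentences, threshold)
    n = len(sentences)
    # Assumed position weight: earlier sentences receive a larger boost.
    weighted = [s * (1 + position_bonus * (n - i) / n) for i, s in enumerate(scores)]
    chosen = []
    while len(chosen) < min(k, n):
        best = max((i for i in range(n) if i not in chosen), key=lambda i: weighted[i])
        chosen.append(best)
        # Assumed discount: penalize candidates similar to the sentence just picked.
        for i in range(n):
            if i not in chosen:
                weighted[i] *= 1 - discount * cosine(bags[i], bags[best])
    # Return the selected sentences in their original order.
    return [sentences[i] for i in sorted(chosen)]
```

In this sketch, calling `summarize(sentences, k=2, threshold=0.1)` would use the thresholded graph, while leaving `threshold` as `None` uses the continuous one. For multi-document input, the position weight would more plausibly be computed per source document; the single-list indexing here is a simplification.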
Similar resources
Query-focused Multi-Document Summarization: Combining a Topic Model with Graph-based Semi-supervised Learning
Graph-based learning algorithms have been shown to be an effective approach for query-focused multi-document summarization (MDS). In this paper, we extend the standard graph ranking algorithm by proposing a two-layer (i.e. sentence layer and topic layer) graph-based semi-supervised learning approach based on topic modeling techniques. Experimental results on TAC datasets show that by considerin...
A Graph-based Approach to Cross-language Multi-document Summarization
Cross-language summarization is the task of generating a summary in a language different from the language of the source documents. In this paper, we propose a graph-based approach to multi-document summarization that integrates machine translation quality scores in the sentence extraction process. We evaluate our method on a manually translated subset of the DUC 2004 evaluation campaign. Resul...
Improved Affinity Graph Based Multi-Document Summarization
This paper describes an affinity graph based approach to multi-document summarization. We incorporate a diffusion process to acquire semantic relationships between sentences, and then compute information richness of sentences by a graph rank algorithm on differentiated intra-document links and inter-document links between sentences. A greedy algorithm is employed to impose diversity penalty on ...
Automatic Multi Document Summarization Approaches
Problem statement: Text summarization can be of different nature ranging from indicative summary that identifies the topics of the document to informative summary which is meant to represent the concise description of the original document, providing an idea of what the whole content of document is all about. Approach: Single document summary seems to capture both the information well but it ha...
Building Document Graphs for Multiple News Articles Summarization: An Event-Based Approach
Since most news articles report several events and these events are referred to in many related documents, we propose an event-based approach to visualize documents as graphs at different conceptual granularities. With a graph-based ranking algorithm, we illustrate the application of the document graph to multi-document summarization. Experiments on DUC data indicate that our approach is competitive wi...
Graph-Based Methods for Multi-document Summarization: Exploring Relationship Maps, Complex Networks and Discourse Information
In this work we investigate the use of graphs for multi-document summarization. We adapt the traditional Relationship Map approach to the multi-document scenario and, in a hybrid approach, we consider adding CST (Cross-document Structure Theory) relations to this adapted model. We also investigate some measures derived from graphs and complex networks for sentence selection. We show that the supe...
Journal title:
- Int. Arab J. Inf. Technol.
Volume 10, Issue
Pages -
Publication date 2013